Client server networks

ABSTRACT

A client node in a network communicates with a server system having a plurality of servers. The client has a distributor which periodically measures server activity, assesses relative loading of the servers, and adjusts the request distribution as well as the server loadings in accordance with the loading assessment.

This invention relates to client server networks and, in particular, to networks in which clients have multiple alternative servers that can handle their requests.

In such client server networks, a problem arises in determining to which of the server nodes a client should send a given request. Known methods of dealing with this problem include load sharing, in which the load is spread across the multiple servers on a round robin basis, and worker standby schemes in which a server is designated as the worker to which requests are usually sent but an alternative server, the standby, handles requests when the worker is unable to. When the worker is again able to handle requests, the requests are again sent to it. However, neither of these approaches is satisfactory. Round robin load sharing is unsuitable if the capacities of the various servers differ significantly as the approach does not take relative server size into account. Worker-standby approaches are unsuitable if the request volume exceeds the capacity of a single server.

The effective capacity of any given server can vary considerably over time, for example because the server is handling requests from other clients, or because of maintenance activity such as archiving.

The present invention aims to address the problems outlined above and to provide an improved handling of requests to multiple servers.

According to the invention there is provided a method of distributing requests from a client node to servers of a server system having a plurality of servers, the method comprising periodically repeating the steps of:

measuring the activity of each server;

assessing the relative loadings of the plurality of servers from the measured server activities; and

adjusting the distribution of requests to individual servers of the plurality of servers in accordance with the assessment of relative loadings;

wherein adjusting the distribution of requests to individual servers comprises adjusting the proportion of requests assigned to each server as a function of measured server activity, the mean activity across the plurality of servers, and the existing proportion of requests assigned to the server.

The invention also provides a system for distributing requests from a client node to servers of a server system having a plurality of servers, comprising a distributor at the client node, the distributor including means for measuring the activity of each server, means for assessing the relative loading of the plurality of servers from the measured server activities, and means for adjusting the distribution of requests to individual servers of the plurality of servers in accordance with the assessment of relative loadings;

wherein the distribution adjustment means comprises means for adjusting the proportion of requests assigned to each server as a function of measured server activity, the mean activity across the plurality of servers and the existing proportion of requests assigned to the server.

Embodiments of the invention have the advantage that load distribution can be even across servers, taking into account their relative capacities.

Preferably, the client node comprises a load controller for applying a load control algorithm to the requests distributed to all servers of the plurality of servers. This has the advantage that loads can be controlled to prevent all the servers overloading and that the distributor makes the plurality of servers look like a single server.

Preferably, server activity is measured by counting the number of server requests and responses over a period. Preferably, this period is an aggregate number of requests.

Preferably, the assessing of relative loading comprises comparing the server requests and responses for each of the servers over the sample period.

Preferably, the request distribution of the server is adjusted according to a function of measured server activity, mean activity across all the servers and the existing proportion of requests assigned to the server.

Preferably, an override facility is provided whereby a request related to a previous request is sent to the same server as the previous request.

Embodiments of the invention will now be described, by way of example, and with reference to the accompanying drawings in which:

FIG. 1 shows a schematic diagram of a request distribution system embodying the invention;

FIG. 2 is a UML use case diagram for the distributor of FIG. 1;

FIG. 3 is a UML class diagram for measurement of server loadings to assess relative loadings;

FIG. 4 is an example sequence for a Distribute use case;

FIG. 5 is an example sequence for a Response use case;

FIG. 6 is an example sequence for a Measure use case;

FIG. 7 is an example sequence for a Re-evaluate use case;

FIG. 8 is a UML activity diagram for Distribute use case;

FIG. 9 is a graph of responses and requests for a relatively large server showing the effect of overloading;

FIG. 10 is a graph of requests and responses for a relatively small server, showing the effect of overloading;

FIG. 11 is a flow chart showing a first embodiment of activity measurement;

FIG. 12 is a flow chart showing a second embodiment of activity measurement; and

FIG. 13 is a flow chart showing a third embodiment of activity measurement.

Referring to FIG. 1, the embodiment to be described is based on a request distributor unit 10, within each client node 12. The request distributor receives both local requests 14 generated internally within the client node and external requests 16 received over an external interface. The distributor distributes server requests among n servers, here shown as server 1, 18, server 2, 20 and server n, 22. The client node also comprises a load control unit 24 which runs a load control algorithm. The distributor ensures that the remote servers are evenly loaded. The distributor is effectively front ended by the load control algorithm.

The architecture illustrated in FIG. 1 enables the same load control to be used for external requests and requests originating locally at the client node. Also, the distributor makes the server network look like a single node to the load control algorithm. It can be seen from FIG. 1 that the distributor is positioned, logically, between the servers and the load control.

Distributing the load to the servers on a request by request basis does not work when a request is related to an earlier request. The distributor includes an override facility which enables some subset of requests to be sent to the same server. This is typically necessary where the server stores content derived from the first request that is needed to process the subsequent requests.

The UML use cases for the distributor are shown in FIG. 2. The local application or remote client 26, time 28 and remote server 30 are shown as being remote from the system. The diagram shows that the distributor distributes 31 server requests in accordance with responses 32 from the servers and some measure 34 based on distribution and response. The distribution is re-evaluated over time 36.

Irrespective of the origin of the server request, that is external or internal to a client node, three requirements for distributing server requests to a set of available servers may be identified as follows:

1. To load the available servers evenly, as far as possible;

2. To avoid high frequency oscillations in server load by not overreacting to past load imbalances, causing new load imbalances; and

3. To allow override requests to insist on a specific server despite the distribution algorithm.

The override request of requirement 3 is invoked when a subsequent request requires context data that would have been created in the server that handled a previous request. The response to the previous request indicated which server handled it.

The distribution used should preferably be independent of the particular details of the client-server protocol to ensure wide applicability. In the embodiments to be described, the measurements for different servers are compared to assess their relative loadings. The distribution of subsequent requests is then adjusted to converge the server measurements to their mean. Thus, the loadings of the servers are rebalanced. This approach cannot tell whether the individual servers are overloaded, as it is a relative measure of loading, and so does not provide a complete control mechanism. If there is a concern at a client node to avoid overloading servers, the client node may incorporate a separate load control algorithm in front of the distributor, for example as shown at 24 in FIG. 1. The load control algorithm works from measurements of the aggregate traffic to all servers. A load control scheme suitable for a server node terminating request traffic should be equally suitable for a client node having a request distributor. It should be understood that load control is not essential.

The approach outlined above is based on measurements of relative loadings. It would be possible to measure traffic related quantities such as request and response counts and their rate derivatives, response time distributions (such as mean, 95th percentile, and standard deviation), and numbers of requests with responses pending. Although these are all universal measurements, they suffer to a greater or lesser degree from the problem that the thresholds separating normal from overload conditions are usually protocol and/or application specific. For example, what constitutes a satisfactory response time for one application is likely to be unacceptable for another. Thus, a design based upon testing measurements against absolute values is unsatisfactory.

In addition to the approach based on measurements of relative loadings, some protocols include an explicit facility for a server to reject client requests when it is overloaded. Distinguishing and counting these rejected requests at the distributor provides an additional set of measurements which can be used to control distribution.

FIGS. 3 to 7 show generalised UML class and sequence diagrams for the distributor. FIG. 3 is a UML class diagram showing three entities: the server 38, request 40 and response 42. The server has the attributes measurement and proportion; the request has the attributes Timestamp, transaction ID and Override; and the response has the attributes Timestamp and transaction ID. The server has the operations read and update.
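By way of illustration only, the three entities of FIG. 3 might be sketched as follows. The field types, the representation of Override as an optional server name, and the signatures of read and update are assumptions of this sketch and are not taken from the drawings; the default proportion of 10 follows the start-up value suggested later in the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Server:
    """Server object: holds the attributes measurement and proportion."""
    name: str                  # hypothetical identifier, not shown in FIG. 3
    measurement: float = 0.0
    proportion: int = 10       # start-up value suggested in the description

    def read(self) -> Tuple[float, int]:
        """Return the current measurement and proportion."""
        return self.measurement, self.proportion

    def update(self, measurement: Optional[float] = None,
               proportion: Optional[int] = None) -> None:
        """Overwrite whichever attribute is supplied."""
        if measurement is not None:
            self.measurement = measurement
        if proportion is not None:
            self.proportion = proportion


@dataclass
class Request:
    timestamp: float
    transaction_id: str
    # Assumed representation: the name of the server that must handle the
    # request, or None when the distribution algorithm is free to choose.
    override: Optional[str] = None


@dataclass
class Response:
    timestamp: float
    transaction_id: str
```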

FIG. 4 shows the sequence for the Distribute use case in which the client or local application and Remote Server 42, 44 are shown outside the distributor system. Requests 46 are sent to the distributor and a read operation 48 performed on every server object instance to obtain current proportions, and which includes receiving responses 50 from the server objects. The requests are then distributed 52 to the appropriate remote server 44 depending on the relative loading.

FIG. 5 shows the sequence for the response use case. Here, the remote server 44 sends responses to the Distributor_Control which sends the responses to the client or local application 42. FIG. 6 shows the sequence for the Measure use case in which updates to measurements 51 are made between Distribution_Control and Server_Objects and replies 56 returned to the Distribution_Control.

FIG. 7 shows the sequence for the Re-evaluate use case. The outside control here is time 58. On expiry of a timer, which is a periodic event, read measurements are made by the Distributor from the Server Objects and an update made. Both receive replies from the server objects and both are invoked on every instance of server object.

The distribution algorithm must not be disrupted by addition and removal of servers. A distribution by parts scheme is preferred although other algorithms are possible. Under the preferred scheme, the servers' proportions are expressed as a ratio according to capacity. For example, in a three server case where one of the servers has twice the capacity of the other two, the ratio would be expressed as 1:2:1.

The number of parts assigned to each server is P1, P2, P3 . . . Pn. In the 1:2:1 case above, P1=1, P2=2 and P3=1. The distribution algorithm adds these numbers cumulatively from the first server to the last, recording the server identity against each interval. Thus, the range 0 to P1 is assigned to server 1, the range P1 to P1+P2 to server 2, the range P1+P2 to P1+P2+P3 to server 3, and so on. For each server request a random number is generated in the range 0 to ΣP_r (r = 1 to n). The server within whose range the random number lies is the server to which the request is sent.

Where the random number generated is equal to one of the interval boundary values, a tie breaker rule is applied.

When the distributor starts up, it assumes that all servers have the same capacity and assigns the same proportion to each. Rather than applying the value 1, a larger value such as 10 is applied to allow for downward, as well as upward adjustments.

FIG. 8 shows a UML activity diagram for the Distribute use case. At 60, Value B is set to zero. Within box 62, a server range is iterated for each server. This involves the steps of reading a proportion P from each server object 64, assigning a range B to (B+P−1) to the server 66 and then resetting the value of B to B=B+P at 68. A random number R is then generated at 70 and at 72 a value X is set such that X=R mod B. At 74, the server whose range includes X is selected and at 76 a request is sent to that server.
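The steps of FIG. 8 can be sketched in a few lines. This is a minimal illustration under stated assumptions: proportions are positive integers held in a dictionary keyed by a hypothetical server name, the function name distribute and the parameter rand are inventions of this sketch, and the random number R is taken to be a 32-bit integer. Because each server is given an inclusive integer range B to (B+P−1), no two ranges share a boundary in this form.

```python
import random
from typing import Callable, Dict, List, Tuple


def distribute(proportions: Dict[str, int],
               rand: Callable[[], int] = lambda: random.getrandbits(32)) -> str:
    """Select a server by the distribution by parts scheme of FIG. 8."""
    b = 0                                       # step 60: B = 0
    ranges: List[Tuple[int, int, str]] = []     # inclusive (low, high, server)
    for server, p in proportions.items():       # box 62: iterate over servers
        if p <= 0:
            continue                            # assumption: skip empty shares
        ranges.append((b, b + p - 1, server))   # step 66: range B..(B+P-1)
        b += p                                  # step 68: B = B + P
    if b == 0:
        raise ValueError("no server has a positive proportion")
    r = rand()                                  # step 70: random number R
    x = r % b                                   # step 72: X = R mod B
    for low, high, server in ranges:            # step 74: find range holding X
        if low <= x <= high:
            return server
    raise AssertionError("unreachable: X always falls in some range")
```

For the 1:2:1 example, calling distribute({"s1": 1, "s2": 2, "s3": 1}) many times would be expected to select s2 about half the time, and adding or removing a server only changes the dictionary passed in.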

In the foregoing description, it has been established that the selection of which server to send a request to should be based on a relative measurement of server loadings and that distribution is achieved using an algorithm based on server capacities. The following description concerns the measurement of server loading, and describes three possible ways of measuring server loadings: response time measurements, outstanding response count, and request and response count. The last is especially preferred but the first two methods are possible. Other methods are also possible and will occur to those skilled in the art.

Response Time Measurements

The distributor can calculate the response time distributions of each server. The relative loadings are then evaluated by direct comparisons of the chosen measure, for example the mean or 95th percentile. As a server becomes overloaded it should exhibit a sharp increase in its response time. This solution is illustrated in the flow chart of FIG. 12 which shows response time measurement at step 120 which is performed for all servers, followed by a comparison of the distribution of response times at step 122. This approach, although possible, suffers from a number of problems:

First, individual servers may differ substantially in their natural response times when working normally, making it more difficult to distinguish overloaded servers. Second, a certain minimum number of response time samples must be collected before their statistical distribution can be assessed reliably. This may make the sample periods too long and the distributor too sluggish in its reaction to changes. Third, the distributor has to associate responses with the relevant requests to obtain timings.
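A minimal sketch of the response time comparison of steps 120 and 122 is given below, using the Python standard library. The threshold MIN_SAMPLES and the function names are assumptions of this sketch; the description does not state how many samples are needed. A server with too few samples is simply left out of the comparison, which illustrates the second problem noted above.

```python
import statistics
from typing import Dict, List, Optional, Tuple

MIN_SAMPLES = 30  # assumed threshold, not specified in the description


def response_time_summary(samples: List[float]) -> Optional[Tuple[float, float]]:
    """Return (mean, 95th percentile), or None if there are too few samples."""
    if len(samples) < MIN_SAMPLES:
        return None
    mean = statistics.fmean(samples)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(samples, n=20)[18]
    return mean, p95


def compare_servers(samples_by_server: Dict[str, List[float]]) -> Dict[str, float]:
    """Map each server with enough samples to its 95th percentile response time."""
    result = {}
    for server, samples in samples_by_server.items():
        summary = response_time_summary(samples)
        if summary is not None:
            result[server] = summary[1]
    return result
```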

Outstanding Responses Counting

As an alternative, the distributor maintains a count of the number of requests sent to each server that have not received at least one response. As this is a continuous measurement, periodic evaluation could be based either on fixed intervals or aggregate numbers of requests. As a server becomes overloaded it should show an increase in its outstanding responses. TCP load control is based on this principle. The process is illustrated in FIG. 13 which shows the counting of requests which have received no response at step 124 and the comparison of these counts over a period of time or aggregate number of requests at step 126. However, there are two potential problems. First, the approach is indirectly affected by any normal differences between server response times. This follows from the basic queuing theory formula which states that:

mean concurrency = transactions per second (tps) × mean response time.

Thus, it is again difficult to distinguish distressed servers from naturally slow servers.
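A short worked example of the queuing formula shows why this is so. The traffic figures are invented for illustration: two healthy servers each handle 100 transactions per second, one with a mean response time of 50 ms and the other 200 ms.

```python
def mean_concurrency(tps: float, mean_response_time_s: float) -> float:
    """Mean number of outstanding requests = tps x mean response time."""
    return tps * mean_response_time_s


# Illustrative figures only (not taken from the description).
fast_server = mean_concurrency(tps=100.0, mean_response_time_s=0.05)
slow_server = mean_concurrency(tps=100.0, mean_response_time_s=0.20)

# The naturally slower server carries four times as many outstanding
# requests although neither server is overloaded.
ratio = slow_server / fast_server
```

Here the outstanding counts are 5 and 20 respectively, so a simple comparison of the counts would wrongly suggest that the slower server is distressed.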

Second, the distributor has to distinguish initial responses to requests from subsequent responses in order that the outstanding response count is only decremented when an initial response is sent. This is difficult to achieve.

Request and Response Counting

This option is the most preferred of the three considered. The distributor records, over a short sample period, the number of requests sent to each server and the number of responses received back from each server. The relative loadings are then evaluated by comparing request and response counts for the sample period. The method is illustrated in FIG. 11 in which server requests are counted at 100, server responses counted at step 110 and a comparison made at step 112. This process is repeated for each server.

The absolute numbers of requests or responses may differ between servers due to the differing capacities. However, provided that none is overloaded the request/response ratio for each should be almost identical. This relies on each server receiving the same mixture of request types, which is true over a large number of requests but may not be reliable over short periods containing few requests. It is preferred therefore, to define samples by aggregate number of requests sent, rather than by time periods.
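The counting of FIG. 11, with the sample defined by an aggregate number of requests rather than by time, might be sketched as follows. The class name SampleCounter, its methods and the choice of sample size are assumptions of this sketch.

```python
from collections import Counter
from typing import Dict, Tuple


class SampleCounter:
    """Counts requests and responses per server for one sample period."""

    def __init__(self, sample_size: int) -> None:
        self.sample_size = sample_size   # aggregate requests per sample
        self.requests: Counter = Counter()
        self.responses: Counter = Counter()

    def on_request(self, server: str) -> bool:
        """Record a request; return True once the sample is complete."""
        self.requests[server] += 1
        return sum(self.requests.values()) >= self.sample_size

    def on_response(self, server: str) -> None:
        """Record a response received from a server."""
        self.responses[server] += 1

    def close_sample(self) -> Dict[str, Tuple[int, int]]:
        """Return {server: (requests, responses)} and start a new sample."""
        servers = set(self.requests) | set(self.responses)
        counts = {s: (self.requests[s], self.responses[s]) for s in servers}
        self.requests.clear()
        self.responses.clear()
        return counts
```

The caller would invoke close_sample whenever on_request returns True and pass the counts to whichever comparison is chosen.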

There are many candidates for the values that might be compared to decide relative loading. A straight ratio of requests to responses is only one alternative. The following is a non-exhaustive list of some candidate quantities and their theoretical ranges. In the list Rq is the number of requests sent to a server and Rs the number of responses returned.

1. Rq/Rs: 0 to +∞

2. Rs/Rq: 0 to +∞

3. Rq−Rs: −∞ to +∞

4. Rs−Rq: −∞ to +∞

5. Rq/(Rq+Rs): 0 to 1

6. Rs/(Rs+Rq): 0 to 1

7. (Rq−Rs)/(Rq+Rs): −1 to +1

8. (Rs−Rq)/(Rq+Rs): −1 to +1

Of these candidate values, nos 1 to 4 have unlimited ranges, caused by the possibility of sample periods with zero responses or requests. Options 5 and 6 give greater emphasis to either the request or response count. Theory suggests that it is the normalised difference between request and response counts that is the most informative indication of relative server loadings, suggesting that options 7 and 8 are the most preferable. The polarity of the value is one consideration, that is whether it increases or decreases with loading. It is intuitive that the value should increase as loading increases, favouring option 7. However, as most protocols spawn, on average, slightly more than one response per request, the last option, option 8, is favoured as the server measurements would be positive under normal conditions.
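Option 8 can be computed as in the following sketch. The function name and the treatment of an empty sample (returning zero when no requests or responses were counted) are assumptions of this sketch; the description does not say how an empty sample should be handled.

```python
def normalised_difference(rq: int, rs: int) -> float:
    """Option 8: (Rs - Rq) / (Rq + Rs), in the range -1 to +1."""
    total = rq + rs
    if total == 0:
        return 0.0  # assumption: treat an empty sample as neutral
    return (rs - rq) / total


# Illustrative counts only: a healthy server returning 1.5 responses per
# request, and a distressed one returning fewer responses than requests.
healthy = normalised_difference(rq=100, rs=150)
distressed = normalised_difference(rq=100, rs=60)
```

With these counts the healthy server measures (150 − 100)/250 = 0.2 and the distressed server (60 − 100)/160 = −0.25, so the value falls as loading rises, which is the polarity of option 8.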

FIGS. 9 and 10 show the request and response counts of two servers as they are subjected to increasing traffic. The figures are both graphs of messages per second (mps) against time. The server of FIG. 9 has a relatively large capacity and that of FIG. 10 a relatively small capacity. The two servers have equal traffic shares. The figures emphasise how the server's response rate plateaus as the server reaches its capacity and becomes distressed. FIGS. 9 and 10 are based on the assumption of an average 1.5 responses per request.

In FIG. 10, the smaller server starts to plateau at time t1. At this time, the distributor should start to shift request traffic to the larger server.

Distribution Adjustment

It was mentioned previously that it is important not to overcompensate in any adjustment of the server distribution, which would set up oscillations. Periodically, or after a certain number of requests, the distributor will need to adjust the proportions of request traffic assigned to each server. The objective is to reduce the discrepancies between the per server values of the measurements chosen. This may be done as follows:

If M₁, M₂, M₃ . . . M_n are the measurements calculated for the last sample period for each server, μ is their arithmetic mean, and P₁, P₂, P₃ . . . P_n their currently assigned proportions per server, the new proportion assigned to the rth server is in general given by a formula of the following kind:

P_r = f(M_r, μ, P_r)

Two candidate formulae and their theoretical ranges are as follows:

1. (1 + (M_r − μ)/(|M_r| + |μ|)) × P_r: 0 to 2×P_r

2. (1 − (M_r − μ)/(|M_r| + |μ|)) × P_r: 0 to 2×P_r

The difference between the two reflects the polarity choice referred to in the choice between options 7 and 8 above. These formulae produce new values proportional to the old and a proportion must never, therefore, be adjusted to zero due to arithmetic truncation.
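The two candidate formulae can be sketched as a single function with a polarity switch. Several points here are assumptions of this sketch rather than statements of the description: proportions are kept as integers, a floor of 1 is used to prevent truncation to zero, a zero denominator leaves the proportion unchanged, and the pairing of the first formula with a measurement that falls with loading (option 8) and the second with one that rises with loading (option 7) is one reading of the polarity remark above.

```python
from typing import Dict


def adjust_proportions(measurements: Dict[str, float],
                       proportions: Dict[str, int],
                       higher_means_more_loaded: bool = False) -> Dict[str, int]:
    """Return new proportions, each a multiple of the old in the range 0 to 2."""
    mu = sum(measurements.values()) / len(measurements)   # arithmetic mean
    new: Dict[str, int] = {}
    for server, m in measurements.items():
        denom = abs(m) + abs(mu)
        # Assumption: with M_r = mu = 0 there is nothing to correct.
        deviation = 0.0 if denom == 0 else (m - mu) / denom
        if higher_means_more_loaded:
            factor = 1 - deviation    # second formula
        else:
            factor = 1 + deviation    # first formula
        # Floor of 1 so that rounding never drives a proportion to zero.
        new[server] = max(1, round(factor * proportions[server]))
    return new
```

For example, with option 8 measurements of 0.2 and 0.1 and both proportions at 10, the mean is 0.15 and the proportions become 11 and 8, shifting traffic towards the server with the higher measurement.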

Oscillations may be reduced in one of two ways. First, the data from the last few sample periods may be explicitly combined to compute the adjustments to proportions for the next period. This effectively uses a sliding window of retained data. Second, the data from all previous samples may be implicitly combined using a decay formula. Under this approach, data for second period adjustment is the average of true second period data and data from the first period. Data used for the third period is the average of true third period data and data used for the second period etc.
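Both smoothing approaches can be sketched briefly. The class names are assumptions of this sketch, as is the use of a plain arithmetic mean over the sliding window; the description does not say how the retained periods are combined. The decay version follows the averaging rule stated above, which gives each older period half the weight of the one after it.

```python
from collections import deque
from typing import Deque, Optional


class WindowSmoother:
    """Combines the last few sample periods explicitly (sliding window)."""

    def __init__(self, size: int) -> None:
        self.window: Deque[float] = deque(maxlen=size)

    def update(self, raw: float) -> float:
        self.window.append(raw)            # oldest value drops out when full
        return sum(self.window) / len(self.window)


class DecaySmoother:
    """Combines all previous periods implicitly by repeated averaging."""

    def __init__(self) -> None:
        self.value: Optional[float] = None

    def update(self, raw: float) -> float:
        if self.value is None:
            self.value = raw               # first period: use the true data
        else:
            self.value = (raw + self.value) / 2
        return self.value
```

One smoother per server would be fed the measurement for each sample period, and its output used in place of the raw measurement when proportions are adjusted.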

The embodiment described enables distribution of requests to multiple servers from a client node in a manner which reflects the individual capacities of the servers and based on their relative loadings.

Various modifications to the embodiments described are possible and will occur to those skilled in the art. For example, other methods of measuring server activity may be possible and other distribution adjustment formulae may be adopted. However, such modifications are within the scope of the invention which is defined by the appended claims.

1. A method of distributing requests from a client node to servers of a server system having a plurality of servers, the method comprising periodically repeating the steps of: a) measuring server activity of each server to obtain measured server activities; b) assessing relative loadings of the plurality of servers from the measured server activities; c) adjusting a distribution of requests to individual servers of the plurality of servers in accordance with the assessing of the relative loadings by adjusting a proportion of the requests assigned to each server as a function of the measured server activities, a mean activity across the plurality of servers, and an existing proportion of the requests assigned to the respective server; d) the measuring step being performed by counting a number of the server requests sent to each of the plurality of servers, and counting a number of the server responses received from each server over a sample period; and e) configuring the sample period to be an aggregate number of requests sent to the plurality of servers.
2. The method according to claim 1, wherein the sample period is a period of time.
3. The method according to claim 1, wherein the assessing step is performed by comparing the server requests and responses for each of the plurality of servers over the sample period.
4. The method according to claim 1, wherein the measuring step is performed by calculating a response time of each server.
5. The method according to claim 4, wherein the assessing step is performed by comparing a distribution of the server response times for each of the plurality of servers.
6. The method according to claim 1, wherein the measuring step is performed by counting a number of requests for each server which have not received a response.
7. The method according to claim 6, wherein the assessing step is performed by comparing the number of requests without responses over an aggregate number of requests to all the plurality of servers.
8. The method according to claim 1, wherein the adjusting step is performed by combining distribution data from previous sample periods to calculate adjustments to distributions for a next sample period.
9. The method according to claim 1, wherein the adjusting step is performed by combining data from previous sample periods according to a decay formula.
10. The method according to claim 3, wherein the relative loadings are compared by comparing, for each server, a ratio (R_q − R_s) / (R_q + R_s) where R_q is the number of requests sent to the server, and R_s is the number of responses returned from the server.
11. The method according to claim 3, wherein the relative loadings are compared by comparing, for each server, a ratio (R_s − R_q) / (R_q + R_s) where R_q is the number of requests sent to the server, and R_s is the number of responses returned from the server.
12. A system for distributing requests from a client node to servers of a server system having a plurality of servers, comprising: a) a distributor at the client node, the distributor including means for measuring server activity of each server to obtain measured server activities; b) means for assessing relative loading of the plurality of servers from the measured server activities; c) means for adjusting a distribution of requests to individual servers of the plurality of servers in accordance with the assessing of the relative loadings by adjusting a proportion of the requests assigned to each server as a function of the measured server activities, a mean activity across the plurality of servers, and an existing proportion of the requests assigned to the respective server; d) the activity measuring means comprising a counter for counting a number of requests for each server which have not received a response; and e) the relative loading assessing means comprising a comparator for comparing the number of requests without responses over an aggregate number of requests to all the plurality of servers.
13. The system according to claim 12, wherein the means for measuring server activity comprises a counter for counting a number of the server requests sent to each of the plurality of servers, and for counting a number of the server responses received from each server over a sample period.
14. The system according to claim 13, wherein the relative loading assessing means comprises means for comparing the server requests and responses for each of the plurality of servers over the sample period.
15. The system according to claim 12, wherein the activity measuring means comprises calculating means for calculating a response time of each server.
16. The system according to claim 15, wherein the relative loading assessing means comprises means for comparing a distribution of the server response times for each of the plurality of servers.
17. The system according to claim 13, wherein the request distribution adjusting means comprises a combiner for combining distribution data from previous sample periods to calculate adjustments to distributions for a next sample period.
18. The system according to claim 14, wherein the means for comparing the server requests and responses compares, for each server, a ratio (R_q − R_s) / (R_q + R_s) where R_q is the number of requests sent to the server, and R_s is the number of responses returned from the server.
19. The system according to claim 14, wherein the means for comparing the server requests and responses compares, for each server, the ratio (R_s − R_q) / (R_q + R_s) where R_q is the number of requests sent to the server, and R_s is the number of responses returned from the server.